Skip to content

task: add task_context propagation support#268

Closed
ballard26 wants to merge 1 commit intoredpanda-data:v26.1.xfrom
ballard26:task-context
Closed

task: add task_context propagation support#268
ballard26 wants to merge 1 commit intoredpanda-data:v26.1.xfrom
ballard26:task-context

Conversation

@ballard26
Copy link
Copy Markdown

@ballard26 ballard26 commented Apr 14, 2026

This PR adds a lw_shared_ptr<task_context> field to seastar::task that propagates
automatically through the task chain. task_context is a polymorphic base
class (using enable_lw_shared_from_this<task_context> with a virtual
destructor) that downstream users subclass for their specific needs. The
field adds 8 bytes to the task struct.

Context propagation uses a thread-local pointer (current_task_context_ptr)
as the authority, matching how current_scheduling_group() works:

  • Reactor loop: sets TLS from task._context before each task runs,
    clears after.

  • Task constructor: reads TLS into _context at creation time (inline).

  • context_guard CRTP bases: two variants for installing context:

    coroutine::context_guard — co_await-able, sets TLS on
    construction and the coroutine promise via hndl.promise() on
    co_await. Destructor restores TLS and syncs back to the promise.
    Move constructor is private to prevent dangling promise pointers.

    scoped_context_guard — plain RAII for .then() chains,
    sets/restores TLS only. Fully movable for do_with.

Cross-shard work items (smp::submit_to) use no_context_tag to skip
context inheritance since lw_shared_ptr is not thread-safe across shards.

Known limitation: context changes inside a .then() continuation (e.g.
creating a child span or clearing context) do not propagate to subsequent
continuations in the same chain, because all continuations are constructed
at chain-setup time and each inherits its own copy. Fixing this would
require schedule-time propagation in add_task(), which introduces
cross-contamination from unrelated tasks that resolve shared promises.
The tracing use case is unaffected — RAII guards span the entire chain
lifetime via do_with().

The feature is guarded by the SEASTAR_TASK_CONTEXT compile flag (ON by
default), following the existing SEASTAR_TASK_BACKTRACE pattern so it
compiles out with zero overhead for users who don't need it.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional task-scoped context (lw_shared_ptr<task_context>) to seastar::task and wires up automatic propagation through task creation and coroutine suspension points, guarded by a new SEASTAR_TASK_CONTEXT compile flag (enabled by default).

Changes:

  • Introduces seastar::task_context and stores/propagates it via seastar::task constructors and internal::propagate_task_context().
  • Updates coroutine awaiters to propagate context into coroutine promises at await_suspend().
  • Adds unit tests and build/config toggles for enabling/disabling the feature.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/task_context_test.cc New unit tests validating task context APIs and propagation behavior.
tests/unit/CMakeLists.txt Adds the new task_context unit test target.
src/core/reactor.cc Implements internal::inherit_task_context() and internal::propagate_task_context().
include/seastar/core/task.hh Adds task_context, stores it in task, and exposes get/set/take APIs plus propagation hooks.
include/seastar/core/coroutine.hh Coroutine promise skips ctor inheritance; awaiters propagate context in await_suspend().
include/seastar/coroutine/as_future.hh Propagates context in as_future awaiter suspend path.
include/seastar/coroutine/maybe_yield.hh Propagates context before scheduling resume.
include/seastar/coroutine/switch_to.hh Propagates context before scheduling-group switch scheduling.
include/seastar/coroutine/parallel_for_each.hh Propagates context before resuming the awaiting coroutine.
include/seastar/coroutine/try_future.hh Propagates context before resuming/destroying coroutine in success path and ready path.
include/seastar/core/smp.hh Prevents cross-shard inheritance for work_item via no_context_tag.
configure.py Adds --task-context tristate configure option.
CMakeLists.txt Adds Seastar_TASK_CONTEXT option and exports SEASTAR_TASK_CONTEXT define when enabled.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/unit/task_context_test.cc Outdated
Comment thread tests/unit/task_context_test.cc Outdated
@ballard26 ballard26 force-pushed the task-context branch 3 times, most recently from 543458b to 1244c4d Compare April 14, 2026 15:54
@StephanDollberg
Copy link
Copy Markdown
Member

StephanDollberg commented Apr 14, 2026

Can you use benches as changed here: 4fa3c1a

and also rp multiplexer bench as a more real life bench

@ballard26 ballard26 force-pushed the task-context branch 2 times, most recently from 56713e4 to 88e9361 Compare April 15, 2026 00:20
@ballard26 ballard26 requested a review from Copilot April 15, 2026 00:21
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread include/seastar/core/context_guard.hh Outdated
}

scoped_context_guard(scoped_context_guard&&) noexcept = default;
scoped_context_guard& operator=(scoped_context_guard&&) noexcept = default;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scoped_context_guard defaulted move-assignment is not safe for an RAII guard: move-assigning into an already-active guard will overwrite _prev/_ctx without first restoring the currently-installed TLS, potentially leaving TLS permanently set to the wrong context. Consider deleting move-assignment (like many other scoped RAII helpers in the codebase) or implementing a custom move-assignment that restores the current guard before taking ownership.

Suggested change
scoped_context_guard& operator=(scoped_context_guard&&) noexcept = default;
scoped_context_guard& operator=(scoped_context_guard&&) = delete;

Copilot uses AI. Check for mistakes.
Comment thread src/core/reactor.cc Outdated
Comment on lines +2746 to +2754
r._current_task = tsk;
#ifdef SEASTAR_TASK_CONTEXT
internal::current_task_context_ref() = tsk->context_ptr();
#endif
tsk->run_and_dispose();
r._current_task = nullptr;
#ifdef SEASTAR_TASK_CONTEXT
internal::current_task_context_ref() = nullptr;
#endif
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Task-context TLS is set/cleared when the reactor runs tasks, but internal::set_current_task() is used elsewhere (e.g. coroutine helpers like include/seastar/coroutine/parallel_for_each.hh and try_future.hh, and src/core/thread.cc) to change _current_task without returning to the reactor loop. Since set_current_task() does not update internal::current_task_context_ref(), current_task_context() can become stale (or non-null when _current_task is set to null). Please update set_current_task() to also set current_task_context_ref() to t ? t->context_ptr() : nullptr when SEASTAR_TASK_CONTEXT is enabled.

Suggested change
r._current_task = tsk;
#ifdef SEASTAR_TASK_CONTEXT
internal::current_task_context_ref() = tsk->context_ptr();
#endif
tsk->run_and_dispose();
r._current_task = nullptr;
#ifdef SEASTAR_TASK_CONTEXT
internal::current_task_context_ref() = nullptr;
#endif
internal::set_current_task(tsk);
tsk->run_and_dispose();
internal::set_current_task(nullptr);

Copilot uses AI. Check for mistakes.
Comment thread include/seastar/core/task.hh
Comment thread include/seastar/core/context_guard.hh Outdated
Comment on lines +24 to +47
#include <seastar/core/task.hh>

#include <coroutine>

namespace seastar {

/// \brief RAII guard for task context in .then() chains.
///
/// Sets the TLS on construction, restores on destruction. New tasks
/// created inline inherit context via the task constructor. Use with
/// do_with to keep the guard alive across a .then() chain.
///
/// Not awaitable. For coroutines, use coroutine::context_guard.
template<typename Derived>
class scoped_context_guard {
task_context* _prev = nullptr;
protected:
lw_shared_ptr<task_context> _ctx;

explicit scoped_context_guard(lw_shared_ptr<task_context> ctx) {
if (ctx) {
_prev = internal::current_task_context_ref();
internal::current_task_context_ref() = ctx.get();
_ctx = std::move(ctx);
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context_guard.hh uses internal::current_task_context_ref() unconditionally, but current_task_context_ref() only exists when SEASTAR_TASK_CONTEXT is enabled. As a result, including this header with task-context disabled will fail to compile (and tests/unit/task_context_test.cc includes it outside the #ifdef). Please wrap the implementation in #ifdef SEASTAR_TASK_CONTEXT and provide no-op guard templates (or stub current_task_context_ref() access) for the disabled build so this header remains includable.

Copilot uses AI. Check for mistakes.
@ballard26
Copy link
Copy Markdown
Author

ballard26 commented Apr 15, 2026

Here's the microbench results from the latest ver:

Benchmark ns (OFF) ns (hint) ns (Δ) insn (OFF) insn (hint) insn (Δ)
coroutine_test.empty 7.19 8.63 +20.0% 167 181 +8.4%
coroutine_test.without_preemption_check 7.79 8.72 +11.9% 184 198 +7.6%
coroutine_test.ready 8.13 12.09 +48.7% 191 227 +18.8%
coroutine_test.nested_ready_chain 37.95 45.30 +19.4% 751 863 +14.9%
coroutine_test.wrapped_ready_chain 40.96 51.16 +24.9% 816 945 +15.8%
coroutine_test.maybe_yield 7.89 8.83 +11.9% 171 185 +8.2%
coro_bench.empty_cont 2.19 4.63 +111.4% 51 72 +41.2%
coro_bench.ss_now 2.18 4.56 +109.2% 43 64 +48.8%
coro_bench.ss_now_collect 1.22 1.23 +0.8% 21 21 +0.0%
coro_bench.co_await_ready 9.52 16.02 +68.3% 222 278 +25.2%
coro_bench.co_await_ready_collect 8.81 12.11 +37.5% 198 229 +15.7%
coro_bench.nested_then5 2.55 7.09 +178.0% 49 112 +128.6%
coro_bench.chain_then5 10.59 36.08 +240.7% 179 664 +271.0%
coro_bench.nested_after_yield 55.19 55.55 +0.7% 702 773 +10.1%
coro_bench.chained_after_yield 138.86 140.94 +1.5% 2250 2401 +6.7%
coro_bench.coro_after_yield 63.66 60.48 -5.0% 968 1034 +6.8%
coro_bench.co_await_ready_nested3 26.81 38.76 +44.6% 582 710 +22.0%
coro_bench.co_await_ready_nested5 44.60 62.04 +39.1% 942 1146 +21.7%
coro_bench.empty_coro 9.02 12.51 +38.7% 203 231 +13.8%
coro_bench.never_awaits_collect 7.76 8.19 +5.5% 174 183 +5.2%
coro_bench.per_field_1 11.91 12.30 +3.3% 230 239 +3.9%
coro_bench.per_field_10 116.96 119.29 +2.0% 2259 2349 +4.0%
coro_bench.per_field_40 475.42 494.09 +3.9% 9010 9370 +4.0%
coroutine_test.unbuffered_generator 362.10 383.89 +6.0% 9265 9692 +4.6%
coroutine_test.buffered_generator 314.06 312.93 -0.4% 6701 6756 +0.8%

Will post the record_multiplexer benchmarks shortly.

Add an optional lw_shared_ptr<task_context> field on seastar::task
that propagates automatically through the task chain, gated by the
SEASTAR_TASK_CONTEXT compile flag. task_context is a polymorphic
base (enable_lw_shared_from_this + virtual dtor) that users subclass
to attach per-chain state such as tracing spans. The field adds
8 bytes to task when the flag is enabled and is compiled out
otherwise.

Propagation uses a thread-local raw pointer as the authority,
mirroring how current_scheduling_group() works:

  - Reactor loop: sets TLS from task._context before run_and_dispose,
    clears after the loop iteration.
  - Task constructor: calls inherit_task_context() which reads TLS
    and shared_from_this()'s it into the new task's _context.
  - set_current_task(): keeps TLS in sync with the current task for
    paths that bypass the reactor loop (parallel_for_each inline
    resumption, thread_context::switch_in, thread.cc blocking .get()).
  - smp: work_item uses a no_context_tag to construct its task
    without inheriting, since context cannot cross shards (lw_shared_ptr
    is not thread-safe).

The primary user-facing API is with_context(ctx, func, args...) —
a function wrapper that installs ctx as TLS for the duration of
func's synchronous execution. Inline work created by func (coroutine
promises, .then() continuations, parallel_for_each branches) inherits
ctx at construction time via TLS, and their own task._context carries
it across suspensions. The caller's task is never modified, so the
function is safe inside coroutines, long-lived driver tasks, and
anywhere else.

current_task_context() and set_current_task_context() expose the
TLS accessors publicly so users can build custom RAII or awaitable
helpers on top when ergonomics demand it.

Adds tests/unit/task_context_test.cc covering the primitive accessors,
propagation across .then() chains and coroutine suspensions, gate,
async, parallel_for_each, when_any, scheduling group switches,
as_future, try_future, maybe_yield, cross-shard non-propagation,
refcount semantics, and regression coverage for the set_current_task
and thread.cc sync points.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants